BAYES-P* SUBSET SELECTION PROCEDURES
FOR THE BEST POPULATION*

by 

Shanti S. Gupta, Purdue University
Hwa-Ming Yang, University of Toledo

ABSTRACT 

Two new selection procedures, called nonrandomized and randomized
Bayes-P* procedures, are defined for selecting a small nonempty
subset of k populations which contains the best population. It is
shown that these procedures have some optimal properties. If we
restrict attention to the class D (D*) of all nonrandomized
(randomized) selection procedures which satisfy the PP*-condition,
that is, the posterior probability of a correct selection, for any
given observation X = x, is not less than P*, a predetermined number
between 1/k and 1, then these two new selection procedures are shown
also to be Bayes decision procedures in the class D and D*,
respectively, provided that some regularity conditions are satisfied.
Robustness of these procedures and comparisons with some other
selection procedures are studied by using Monte Carlo simulations.

1.  INTRODUCTION 

Suppose we have k populations $\pi_1,\ldots,\pi_k$ whose distributions
are determined by unknown real parameters $\theta_1,\ldots,\theta_k$,
respectively. In a subset selection problem, the goal is to select a
subset of the populations which includes the population associated with
the largest parameter with high probability and, possibly, includes the
others with low probabilities. A population $\pi_i$ will be called the
best population if $\theta_i \ge \theta_j$ for all $j \ne i$. The
remaining k-1 populations will be called non-best. If more than one
population satisfies this condition, we arbitrarily tag one of them
and call it the best one.

*This research was supported by the Office of Naval Research Contract
N00014-75-C-0455 at Purdue University. Reproduction in whole or in
part is permitted for any purpose of the United States Government.

A  large  body  of  literature  exists  in  the  area  of  subset  selection 
procedures  (see  Gupta  and  Panchapakesan  (1979)).  As  pointed  out  by 
many  authors  (see,  for  example,  Bahadur  (1950),  Bechhofer  (1954))  the 
testing  of  homogeneity  of  population  means  or  variances  is  not  a 
satisfactory  solution  to  a  comparison  among  several  populations. 

Gupta (1956, 1965) gave a maximum-type subset selection procedure.
The maximum-type procedures present a direct and relatively efficient
way to meet our goal. Gupta and Hsu (1978) studied the performance
of the maximum-type procedure, the average-type procedure (Seal (1955,
1957)) and Bayes procedures. Berger (1979) and Berger and Gupta (1980)
proved that the maximum-type procedure is minimax under certain loss
functions. The LFC (least favorable configuration) of the maximum-type
procedure usually occurs when the distributions are identical, i.e.,
under the hypothesis of homogeneity. In many cases, the hypothesis
of homogeneity is rejected at some small significance level. It seems
then that the maximum-type procedure is still conservative. Therefore
we may wish to relax (modify) the so-called P*-condition. On the
other hand, in the decision-theoretic approach, a Bayes procedure always
gives us a most economic decision under a certain loss; however, this
does not mean that its quality is good enough to pass a certain level.
Suppose the loss function is a linear combination of $L_i(\theta)$,
$i = 1,\ldots,k$, where $L_i(\theta)$ is the loss if the ith population
is selected in the subset, as was assumed by Bahadur and Goodman (1952),
Dunnett (1960), Lehmann (1966), Eaton (1967) and Alam (1973). As pointed
out by Goel and Rubin (1977), the decision-theoretic procedures mentioned
above do not seem to be appropriate, mainly because they ignore a
reasonable component of loss which depends on whether or not the selected
subset contains the best population and secondly because they specify the
subset size in advance, whereas it should depend on the information
available from the sample. We may use some loss functions that involve
an additional component, such as the loss function given by Gupta and
Hsu (1978), which is associated with the probability of incorrect
selection, or the one given by Goel and Rubin (1977), which is associated
with the distance between the selected subset and the best population,
or some others which are proposed by Chernoff and Yahav (1977), Bickel
and Yahav (1977) and Kim (1979), to improve the quality of decision.
However, the results are quite sensitive to the weights of the
components, or equivalently, the ratio of the coefficients of the two
components. In practice we always have some difficulty in figuring out
the ratio whenever the two components of loss are not comparable, or
they are comparable but the ratio is not a constant function or is not
completely known. In these situations we may wish to try some other
methods of attack.


To guarantee the quality of selection procedures, we would like to
have a 'quality control' on the class of all possible selection
procedures, that is, any selection procedure with lower quality will
be removed, even though it might be the cheapest one under some loss.
By using the PP*-condition (defined in Section 2) as a filter or
control condition we get a class of selection procedures D and its
randomized version D*. The PP*-condition represents the minimum
quality (accuracy) level of a selection procedure under certain prior
information. We try to derive one of these procedures which gives
the minimum risk under a large family of loss functions and has
properties which we think an optimal selection procedure should have.

In Section 2 we define some notation, the PP*-condition
(posterior-P*-condition), the classes D and D*, and propose two
selection procedures $\psi^B$ and $\psi^{B*}$. Some selection
procedures close to $\psi^B$ but restricted to normal populations were
studied by Roth (1978) and Naik (1978). The optimal properties of
$\psi^B$ and $\psi^{B*}$ in D and D*, respectively, such as being
ordered, just, most efficient and Bayes with respect to a large family
of loss functions, are shown in Section 3. Their application to normal
distributions is discussed in Section 4. In Section 5 procedure
$\psi^B$ is compared with the maximum-type procedure. In Section 6
we discuss their applications to the selection problem for Poisson
distributions and Poisson processes and their relation to the selection
of gamma distributions. Section 7 deals with comparisons of the
performance of selection procedures $\psi^B$, $\psi^{B*}$, $\psi^M$ and
$\psi^{MED}$. Here $\psi^M$ and $\psi^{MED}$ are the maximum-type
selection procedures based on sample means and sample medians,
respectively (see Gupta (1956, 1965) and Gupta and Singh (1980)). The
comparisons are based on Monte Carlo studies. Robustness of these four
procedures is studied in terms of the expected size of the selected
subset and the efficiency (defined in Section 3), where robustness is
in the sense of the effect on the performance of the procedure when the
k true distributions are not normal but, say, logistic, double
exponential or the contaminated distribution (gross error model)
(Tukey (1960)). Further discussion about the Bayes-P* selection
procedures is given in Section 8.


2.  NOTATION  AND  FORMULATION 

Assume that we have $n_i$ independent observations $X_{ij}$,
$j = 1,\ldots,n_i$, for population $\pi_i$, $i = 1,\ldots,k$. Let
$X_i = T_i(X_{i1},\ldots,X_{in_i})$ be a suitable estimator of
$\theta_i$, $i = 1,\ldots,k$; assume that the $X_i$'s are independently
distributed. Usually $X_i$ is a sufficient statistic for $\theta_i$.
Let $\theta = (\theta_1,\ldots,\theta_k) \in \Theta \subset R^k$ and let
$X = (X_1,\ldots,X_k)$ with cumulative distribution function (cdf)
$F(x \mid \theta)$ and density (frequency) $f(x \mid \theta)$. A
selection procedure will be denoted by
$\psi(x) = (\psi_1(x),\ldots,\psi_k(x))$ where
$\psi_i(x): R^k \to [0,1]$ is the probability that $\pi_i$ is included
in the selected subset when X = x is observed. A selection procedure is
called nonrandomized if all $\psi_i$'s are 0 or 1; otherwise, it is a
randomized procedure. A correct selection (CS) is defined to be the
selection of any subset that includes the best population. Suppose we
have a prior distribution $\tau$ for
$\theta = (\theta_1,\ldots,\theta_k)$ and our control condition is that
for any given observation the posterior probability of CS must be
greater than or equal to P*, a preassigned value between 1/k and 1.

That is,

(2.1)  $P(\mathrm{CS} \mid \psi, X = x) = \sum_{i=1}^{k} \psi_i(x)\, p_i(x) \ge P^*$, for all x,

where

(2.2)  $p_i(x) = P(\pi_i \text{ is the best} \mid X = x) = P(\theta_i \text{ is the largest} \mid X = x)$.


For convenience, we now assume that the posterior cdf of $\theta$ is
absolutely continuous. Then it is clear that

$\sum_{i=1}^{k} p_i(x) = 1$,

hence selection procedures of this kind always exist. Let
$p_{[1]}(x) \le \cdots \le p_{[k]}(x)$ be the ordered $p_i(x)$'s and let
$\pi_{(i)}$ be the population associated with $p_{[i]}(x)$,
$i = 1,\ldots,k$; then a subset selection procedure is completely
specified by $\{\psi_{(1)},\ldots,\psi_{(k)}\}$ where $\psi_{(i)}$ is
defined by

(2.3)  $\psi_{(i)}(x) = P(\pi_{(i)} \text{ is selected} \mid \psi, X = x)$, $i = 1,\ldots,k$.

Definition 2.1. Given a number P*, 1/k < P* < 1, and a prior $\tau$, we
say a selection procedure $\psi$ satisfies the PP*-condition
(posterior-P*-condition) if

(2.4)  $\psi_{(k)}(x) = 1$ and $P(\mathrm{CS} \mid \psi, X = x) \ge P^*$ for all x.


Remark  1.  The  PP*-condition  implies  the  expected  probability  of  CS 
with  respect  to  a  given  prior  is  not  less  than  P*.  Since  the  prior 
information  is  used  in  PP*-condition,  it  is  different  from  the  usual 
so-called  P*-condition. 

Given a prior $\tau$, let $D = D(\tau,P^*)$ ($D^* = D^*(\tau,P^*)$) be
the class of all nonrandomized (randomized) selection procedures in
which all procedures satisfy the PP*-condition for any given
observation X = x.

Now, we propose two selection procedures $\psi^B$ and $\psi^{B*}$ as
follows:


Definition 2.2. Given a number P* (1/k < P* < 1), an observation
X = x and a prior $\tau$, the selection procedure $\psi^B$ is defined
by $\{\psi^B_{(1)},\ldots,\psi^B_{(k)}\}$ where

$\psi^B_{(i)}(x) = 1$ if $i \ge j$, and $\psi^B_{(i)}(x) = 0$ otherwise,

and j is the largest integer between 1 and k such that

$\sum_{i=j}^{k} p_{[i]}(x) \ge P^*$.


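Definition 2.3. Given a number P* (1/k < P* < 1), an observation
X = x and a prior $\tau$, the randomized selection procedure
$\psi^{B*}$ is defined by $\{\psi^{B*}_{(1)},\ldots,\psi^{B*}_{(k)}\}$
where

$\psi^{B*}_{(k)}(x) = 1$,

and, for $j \ne k$,

$\psi^{B*}_{(j)}(x) = 1$, if $\sum_{i=j}^{k} p_{[i]}(x) \le P^*$;
$\psi^{B*}_{(j)}(x) = \nu$, if $\sum_{i=j+1}^{k} p_{[i]}(x) < P^*$ and $\sum_{i=j}^{k} p_{[i]}(x) > P^*$;
$\psi^{B*}_{(j)}(x) = 0$, otherwise;

where the constant $\nu$ is determined so that

$\nu\, p_{[j]}(x) + \sum_{i=j+1}^{k} p_{[i]}(x) = P^*$, $0 < \nu < 1$.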


Example. If k = 3, P* = 0.90 and the posterior probabilities are
$p_1(x) = .05$, $p_2(x) = .80$, $p_3(x) = .15$, then selection
procedure $\psi^{B*}$ will select the population $\pi_2$
(corresponding to $p_{[3]}(x)$) with probability 1, and select $\pi_3$
with probability $\nu$, where $\nu$ is given by

.15$\nu$ + .80 = .90
$\nu$ = .10/.15 = 2/3.
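The two definitions can be sketched in a few lines of code. The following is a minimal illustration, not the authors' implementation: the function name `bayes_p_star` and its arguments are hypothetical, and the posterior probabilities $p_i(x)$ are assumed to have been computed already and to sum to 1.

```python
def bayes_p_star(post, p_star, randomized=False):
    """Inclusion probabilities psi_i(x) for the Bayes-P* rules.

    post       : posterior probabilities p_i(x), assumed to sum to 1
    p_star     : required posterior probability of a correct selection
    randomized : False gives Definition 2.2, True gives Definition 2.3
    """
    k = len(post)
    # Visit populations from the largest posterior probability downwards.
    order = sorted(range(k), key=lambda i: post[i], reverse=True)
    psi = [0.0] * k
    total = 0.0          # posterior P(CS) accumulated so far
    tol = 1e-12          # guard against floating-point round-off
    for rank, i in enumerate(order):
        if rank > 0 and total >= p_star - tol:
            break        # the PP*-condition is already met
        fits = total + post[i] <= p_star + tol
        if rank == 0 or fits or not randomized:
            psi[i] = 1.0                         # include with certainty
            total += post[i]
        else:
            psi[i] = (p_star - total) / post[i]  # the constant nu
            total = p_star
    return psi


# The example above: k = 3, P* = 0.90.
print(bayes_p_star([0.05, 0.80, 0.15], 0.90))
print(bayes_p_star([0.05, 0.80, 0.15], 0.90, randomized=True))
```

In the nonrandomized case the sketch selects $\pi_2$ and $\pi_3$ outright; in the randomized case $\pi_3$ is included with probability 2/3, in agreement with the example.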

By definition $\psi^B$ ($\psi^{B*}$) is an element of D (D*). In the
next section, we will show some properties for both selection
procedures $\psi^B$ and $\psi^{B*}$. They are the properties of being
ordered, just, efficient and Bayes in the respective class.

3.  OPTIMAL  PROPERTIES 

In this section some properties of selection procedures $\psi^B$ and
$\psi^{B*}$ are studied.

Definition 3.1. A selection procedure $\psi$ is called ordered if for
every $x \in R^k$, $x_i \le x_j$ implies $\psi_i(x) \le \psi_j(x)$. It
is called just if for every $i = 1,\ldots,k$, and $x, x' \in R^k$,
$\psi_i(x) \le \psi_i(x')$ whenever $x_i \le x_i'$ and $x_j \ge x_j'$
for any $j \ne i$.

Just  procedures  were  defined  and  investigated  in  more  generality 
by  Nagel  (1970)  and  Gupta  and  Nagel  (1971). 

Definition 3.2. A selection procedure is translation invariant if
for every $x \in R^k$ and every $c \in R$, $\psi_i(x + c\mathbf{1}) = \psi_i(x)$
for every $i = 1,\ldots,k$, where $\mathbf{1} = (1,\ldots,1)$.

Lemma 3.1. (Berger and Gupta (1980)) A selection procedure
$\psi(x) = (\psi_1(x),\ldots,\psi_k(x))$ is just and translation
invariant if and only if the following two conditions hold:

(1) for every $i = 1,\ldots,k$, $\psi_i$ is a function only of the set
of differences $\{x_j - x_i \mid j = 1,\ldots,k,\ j \ne i\}$, and

(2) if x and y satisfy $x_j - x_i \le y_j - y_i$ for every $j \ne i$,
then $\psi_i(x) \ge \psi_i(y)$.

Let $p = (p_1,\ldots,p_k)$, where the $p_i$'s are defined by (2.2). If
we treat p as a selection procedure, then $\psi^B$ ($\psi^{B*}$) is
ordered, just and translation invariant if and only if p is ordered,
just and translation invariant, respectively. Therefore, we have
Theorem 3.1.

Theorem 3.1. Selection procedures $\psi^B$ and $\psi^{B*}$ are just and
translation invariant if

(1) for every $i = 1,\ldots,k$, $p_i = p_i(x)$ is a function only of
the set of differences $\{x_j - x_i \mid j = 1,\ldots,k,\ j \ne i\}$, and

(2) if x and y satisfy $x_j - x_i \le y_j - y_i$ for every $j \ne i$,
then $p_i(x) \ge p_i(y)$.

Some sufficient conditions for p to be ordered, just or
translation invariant are given below:


Theorem 3.2. Let $H(\theta \mid x)$ be the posterior cdf of $\theta$,
given X = x. If $H(\theta \mid x)$ is absolutely continuous and has the
generalized stochastic increasing property (GSIP), that is:

(1) $H(\theta \mid x) = \prod_{i=1}^{k} H_i(\theta_i \mid x)$, where $H_i(\cdot \mid x)$ is the posterior cdf of $\theta_i$,

(2) $H_i(t \mid x) \ge H_j(t \mid x)$ for any t, whenever $x_i \le x_j$,

then both $\psi^B$ and $\psi^{B*}$ are ordered and just. If, in
addition, $H_i$ has location parameter $x_i$, that is,
$H_i(\theta_i \mid x) = H_i(\theta_i - x_i)$ for every $i = 1,\ldots,k$,
then both $\psi^B$ and $\psi^{B*}$ are also translation invariant.


Proof: The first part can be proved by using integration by parts.
Since $x_i \le x_j$ implies $H_i(t \mid x) \ge H_j(t \mid x)$ for all t,

$$\begin{aligned}
p_i(x) &= P(\theta_i = \theta_{[k]} \mid x) \\
&= \int \prod_{m \ne i} H_m(t \mid x)\, dH_i(t \mid x) \\
&\le \int \prod_{m \ne j} H_m(t \mid x)\, dH_i(t \mid x) \\
&= 1 - \int H_i(t \mid x)\, d\Big[\prod_{m \ne j} H_m(t \mid x)\Big] \\
&\le 1 - \int H_j(t \mid x)\, d\Big[\prod_{m \ne j} H_m(t \mid x)\Big] \\
&= p_j(x).
\end{aligned}$$

Therefore, both $\psi^B$ and $\psi^{B*}$ are ordered. The proof of
justness is similar to the above and hence omitted. The proof of the
second part is obvious.

Let G denote the group of all permutations of the components of a
k-component vector. A set $S \subset R^k$ is called symmetric if
gS = S for all $g \in G$. A distribution H is called symmetric if
H(S) = H(gS) for all measurable sets S and $g \in G$. A family of
distributions $\{P_\theta\}$ is called invariant with respect to G if
$P_\theta(S) = P_{g\theta}(gS)$ for all measurable sets S and $g \in G$.

Given an observation X = x, suppose the set s is the selected subset
under a selection procedure $\psi$ in D; then the loss can be described
by a non-negative real-valued function $L(\theta,s)$ which has the
properties below:

Definition 3.3. A loss function L has property T if and only if, for
all $g \in G$ and $\theta \in \Theta$,

(1) $L(\theta,s) = L(g\theta,gs)$,

(2) $L(\theta,s)$ is non-increasing in $\theta_i$ for $i \in s$, and

(3) $L(\theta,s) \le L(\theta,s')$, if $s \subset s'$.

Let property T' indicate that the loss function satisfies the
first two conditions of property T, namely, the invariance and
monotonicity properties. The third (ordering) condition of property T
is reasonable, because the indirect loss of an incorrect selection is
controlled by the PP*-condition and the direct loss of a selected
subset is naturally no less than that of any of its subsets.


Example 3.1. The following loss functions satisfy property T:

a. $L(\theta,s) = |s|$, the size of s.

b. $L(\theta,s) = \sum_{i \in s} L_i(\theta)$ where $L_i(\theta)$, the
loss if the ith population is selected in the subset, is invariant and
monotonic, i.e. $L_{gi}(g\theta) = L_i(\theta)$ for all $g \in G$, and
$L_i(\theta) \ge L_{i+1}(\theta)$ whenever $\theta_i \le \theta_{i+1}$,
$i = 1,\ldots,k-1$. A useful form of this loss is
$L_i(\theta) = \phi(q(\theta),\theta_i)$, which is non-increasing in
$\theta_i$, where $q(\theta)$ is a real-valued symmetric function of
$\theta$. For example,
$\phi(q(\theta),\theta_i) = c(\theta_{[k]} - \theta_i)^{\alpha}$,
$\alpha > 0$, $c > 0$.

The next theorem (Theorem 3.3) shows that under some regularity
conditions selection procedure $\psi^B$ ($\psi^{B*}$) is Bayes in
D (D*) if the loss function has property T.


Theorem 3.3. Suppose the prior distribution $\tau$ is symmetric on
$\Theta$. Given $\theta \in \Theta$, $X_1,\ldots,X_k$ are independently
distributed and the pdf $f(x \mid \theta)$ has the monotone likelihood
ratio (MLR) property. Then $\psi^B$ ($\psi^{B*}$) is ordered and is a
Bayes procedure in D (D*) provided that the loss function has
property T.


Proof: First we need to show that the selection procedure $\psi^B$
($\psi^{B*}$) or p is ordered. For any $i \ne j$, if $x_i \le x_j$, let
$\Theta_1 = \{\theta \in \Theta \mid \theta_i < \theta_j\}$; then

$$\begin{aligned}
p_j(x) - p_i(x)
&= b \int_{\Theta} \big(I_{\{\theta_j = \theta_{[k]}\}}(\theta) - I_{\{\theta_i = \theta_{[k]}\}}(\theta)\big) f(x \mid \theta)\, d\tau(\theta) \\
&= b \int_{\Theta_1} \big(I_{\{\theta_j = \theta_{[k]}\}}(\theta) - I_{\{\theta_i = \theta_{[k]}\}}(\theta)\big) \big(f(x \mid \theta) - f(x \mid \theta')\big)\, d\tau(\theta) \\
&\ge 0,
\end{aligned}$$

where b is a normalizing factor and $\theta'$ is obtained from $\theta$
by interchanging the components $\theta_i$ and $\theta_j$. The last
equality above is an application of the assumption that $\tau$ is
symmetric on $\Theta$. The final inequality is based on the fact that,
on $\Theta_1$, $f(x \mid \theta) - f(x \mid \theta')$ is non-negative by
the MLR property of $f(x \mid \theta)$ and
$I_{\{\theta_j = \theta_{[k]}\}}(\theta) - I_{\{\theta_i = \theta_{[k]}\}}(\theta)$
is non-negative.

For any given observation X = x, $\psi^B$ always has the minimum size
of the selected subset, say m, in D. Therefore, under the assumptions
and property T, the selection problem turns into "the mth decision
problem" as mentioned in Lemma 1 of Goel and Rubin (1977); then by
Theorem 4.1 of Eaton (1967), the result holds. The proof for
procedure $\psi^{B*}$ is analogous, and hence is omitted.

Theorem 3.4. Under the assumptions of Theorem 3.2, $\psi^B$
($\psi^{B*}$) is a Bayes procedure in D (D*) provided that the loss
function has property T.

Proof: By Theorem 3.2, $\psi^B$ ($\psi^{B*}$) is ordered. By an
argument similar to the argument in Theorem 3.3, the theorem is proved.

For any selection procedure $\psi \in D$, the posterior efficiency of
$\psi$, given observation X = x, is defined by

$\mathrm{eff}(\psi \mid x) = P(\mathrm{CS} \mid \psi, x) / E(S \mid \psi, x)$

where $E(S \mid \psi, x)$ is the posterior expected size of the selected
subset. The expectation of $\mathrm{eff}(\psi \mid x)$ is the efficiency
of procedure $\psi$ and is denoted by $\mathrm{eff}(\psi)$. A selection
procedure $\psi \in D$ (D*) is called most efficient (ME) in D (D*) if
$\mathrm{eff}(\psi) \ge \mathrm{eff}(\psi')$ for all $\psi' \in D$ (D*).


Theorem 3.5. The selection procedures $\psi^B$ and $\psi^{B*}$ are ME
in D and D*, respectively.

Proof: In D, given any observation X = x, since $\psi^B$ has the
minimum size of selected subset, say m, any selection procedure $\psi$
in D should have its size of selected subset equal to m+c for some
$0 \le c \le k-1$. Now,

$$\begin{aligned}
\mathrm{eff}(\psi \mid x)
&\le \{p_{[k-m-c+1]}(x) + \cdots + p_{[k]}(x)\}/(m+c) \\
&\le \{c\, p_{[k-m]}(x) + p_{[k-m+1]}(x) + \cdots + p_{[k]}(x)\}/(m+c) \\
&\le \{p_{[k-m+1]}(x) + \cdots + p_{[k]}(x)\}/m \\
&= \mathrm{eff}(\psi^B \mid x),
\end{aligned}$$

so the first part is proved. For $\psi^{B*}$ in D*, the proof is
similar and hence is omitted.
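The inequality chain can be checked numerically at a single observation by brute force. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, the posterior probabilities are taken as given, and the search simply enumerates every nonrandomized subset that satisfies (2.4).

```python
from itertools import combinations


def posterior_efficiency(post, subset):
    """eff(psi | x): posterior P(CS) divided by the size of the subset."""
    return sum(post[i] for i in subset) / len(subset)


def most_efficient_subset(post, p_star, tol=1e-12):
    """Enumerate all subsets in D at one x and return the most efficient."""
    k = len(post)
    top = max(range(k), key=lambda i: post[i])   # population pi_(k)
    best = None
    for size in range(1, k + 1):
        for subset in combinations(range(k), size):
            if top not in subset:
                continue                          # (2.4): psi_(k)(x) = 1
            if sum(post[i] for i in subset) < p_star - tol:
                continue                          # (2.4): P(CS) >= P*
            if best is None or (posterior_efficiency(post, subset)
                                > posterior_efficiency(post, best) + tol):
                best = subset
    return best


# With p = (.05, .80, .15) and P* = .90 the winner is {pi_2, pi_3},
# which is the subset chosen by the nonrandomized Bayes-P* rule.
print(most_efficient_subset([0.05, 0.80, 0.15], 0.90))
```

The enumeration is exponential in k and is meant only as a check for small k; the theorem itself makes the search unnecessary.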

4.  EXTENSION  AND  APPLICATIONS 

In this section, the formulas for the posterior probabilities
which are necessary to carry out the selection procedures $\psi^B$ and
$\psi^{B*}$ are given under various assumptions.

Suppose we have k populations $\pi_1,\ldots,\pi_k$; $\pi_i$ has normal
distribution $N(\mu_i,\sigma_i^2)$ where the $\mu_i$'s are unknown.
Assume that we have a sample $X_{i1},\ldots,X_{in_i}$ for each
population $\pi_i$. Let $X_i$ be the sample mean and let
$X = (X_1,\ldots,X_k)$. Suppose we are interested in selecting a subset
containing the population having the largest population mean under the
PP*-condition, with respect to some prior distribution $\tau$ of
$\mu = (\mu_1,\ldots,\mu_k)$.

A.  No  Prior  Information 

Under the situation where very little is known a priori, we may
use a 'non-informative' prior (see Box and Tiao (1973)) provided that
the unknown parameters are locally independent a priori (see Guttman
and Tiao (1964)).

A.1. Common Variance $\sigma^2$ (Known) and Common Sample Size n

By using the non-informative prior $\tau(\mu) \propto c$, we have

(4.1)  $p_{[i]}(x) = \int_{-\infty}^{\infty} \prod_{j \ne i} \Phi\big(t + \sqrt{n}\,\sigma^{-1}(x_{[i]} - x_{[j]})\big)\, d\Phi(t)$, $i = 1,\ldots,k$.

A.2. Unequal Variances $\sigma_i^2$'s (Known) and Unequal Sample Sizes $n_i$'s

By using the same non-informative prior $\tau(\mu) \propto c$, we have

(4.2)  $p_i(x) = \int_{-\infty}^{\infty} \prod_{j \ne i} \Phi\big(t v_i/v_j + (x_i - x_j)/v_j\big)\, d\Phi(t)$

where $v_i = \sigma_i/\sqrt{n_i}$, $i = 1,\ldots,k$.
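Formula (4.2) is a one-dimensional integral and is easy to evaluate numerically. The sketch below uses a plain trapezoidal rule over t in [-8, 8]; the function name, the grid size and the truncation limit are illustrative choices, not part of the paper. Case A.1 is recovered by passing equal standard deviations and sample sizes.

```python
import math


def norm_cdf(z):
    """Standard normal cdf Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def posterior_best_probs(xbar, sd, n, grid=2001, lim=8.0):
    """p_i(x) from (4.2): sample means xbar, known sds, sample sizes n."""
    k = len(xbar)
    v = [sd[i] / math.sqrt(n[i]) for i in range(k)]   # v_i = sigma_i/sqrt(n_i)
    h = 2.0 * lim / (grid - 1)
    probs = []
    for i in range(k):
        acc = 0.0
        for g in range(grid):
            t = -lim + g * h
            term = norm_pdf(t)                        # dPhi(t) = phi(t) dt
            for j in range(k):
                if j != i:
                    term *= norm_cdf(t * v[i] / v[j]
                                     + (xbar[i] - xbar[j]) / v[j])
            weight = 0.5 if g in (0, grid - 1) else 1.0
            acc += weight * term
        probs.append(acc * h)
    return probs


p = posterior_best_probs([1.0, 1.5, 0.2], [1.0, 2.0, 1.5], [10, 20, 15])
print(p, sum(p))
```

For k = 2 the integral has the closed form $\Phi\big((x_1 - x_2)/\sqrt{v_1^2 + v_2^2}\big)$, which gives a convenient check on the quadrature.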


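A.3. Unequal Variances $\sigma_i^2$'s and Unequal Sample Sizes $n_i$'s

By using the non-informative prior $\tau(\mu_i,\sigma_i) \propto \sigma_i^{-1}$
for each population, we have

(4.3)  $p_i(x) = \int_{-\infty}^{\infty} \prod_{j \ne i} T_{\nu_j}\Big(t\, \frac{s_i/\sqrt{n_i}}{s_j/\sqrt{n_j}} + \frac{x_i - x_j}{s_j/\sqrt{n_j}}\Big)\, dT_{\nu_i}(t)$

where $\nu_i = n_i - 1$,

(4.4)  $\nu_i s_i^2 = \sum_{j=1}^{n_i} (x_{ij} - x_i)^2$, $i = 1,\ldots,k$,

and $T_\nu$ is the cdf of the t-distribution with $\nu$ degrees of
freedom. For large $\nu$, it can be approximated by the normal
distribution.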


B. Independent Normal Prior

B.1. Identical Prior

Assume the $\mu_i$'s have common distribution $N(\theta_0,\sigma_0^2)$
and, given $\mu_i$, $X_i$ has distribution $N(\mu_i,\sigma^2/n)$; then

(4.5)  $p_{[i]}(x) = \int_{-\infty}^{\infty} \prod_{j \ne i} \Phi\big(t + b n \sigma^{-2}(x_{[i]} - x_{[j]})\big)\, d\Phi(t)$,

where $b = (\sigma_0^{-2} + n\sigma^{-2})^{-1/2}$, $i = 1,\ldots,k$.


B.2. Non-Identical Prior

Assume the $\mu_i$'s have independent normal prior distributions
$N(\theta_i,\sigma_{0i}^2)$ and, given $\mu_i$, $X_i$ has normal
distribution $N(\mu_i,\sigma_i^2/n_i)$. Let

(4.6)  $z(x_i) = b_i^2(\sigma_{0i}^{-2}\theta_i + v_i^{-2}x_i)$,

(4.7)  $b_i^2 = (\sigma_{0i}^{-2} + v_i^{-2})^{-1}$,

with $v_i = \sigma_i/\sqrt{n_i}$ as in A.2; we have, for $i = 1,\ldots,k$,

(4.8)  $p_i(x) = \int_{-\infty}^{\infty} \prod_{j \ne i} \Phi\big[(t b_i + z(x_i) - z(x_j))/b_j\big]\, d\Phi(t)$.

In Case A.1, $\psi^B$ and $\psi^{B*}$ are just, translation invariant
and ordered, hence they will be Bayes procedures in D and D*,
respectively, provided the loss function has property T.



C. General Normal Model

Suppose we have k normal populations with common known variance
$\sigma^2 > 0$ and common sample size n. The observation can be reduced
to $X = (X_1,\ldots,X_k)$ where $X_i$ is the sample mean for population
$\pi_i$. Assume that, given $\mu$, X has normal distribution
$N(\mu, vI)$, where $v = \sigma^2/n$, and $\mu$ itself has normal
distribution $N(\theta_0\mathbf{1}, rI + wU)$ with $\theta_0 \in R$,
$r = \sigma_0^2 > 0$, $w > -r/k$, $\mathbf{1} = (1,\ldots,1)$ and
$U = \mathbf{1}'\mathbf{1}$. Note that here r > 0 and w > -r/k are
sufficient and necessary for rI + wU to be positive definite. This
model was chosen by Chernoff and Yahav (1977) (t > 0), Gupta and Hsu
(1978) and Miescke (1979). In this model, the $p_i(x)$'s are exactly
the same as those of the independent prior case B.1.


5. RELATION BETWEEN PROCEDURES $\psi^B$ AND $\psi^M$ IN THE NORMAL
LOCATION PARAMETER CASE

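Suppose there are k independent normal populations with a common
known variance $\sigma^2$ and common sample size n. For this case
Gupta (1956) proposed and studied the maximum-type procedure

$\psi^M$: Select $\pi_i$ iff $X_i \ge X_{[k]} - d\sigma/\sqrt{n}$, $i = 1,\ldots,k$;

where d = d(k,P*) > 0 is determined by the P*-condition, that is,

(5.1)  $\int_{-\infty}^{\infty} \Phi^{k-1}(t+d)\, d\Phi(t) = P^*$.

When the prior $\tau$ is the non-informative prior, the following
theorem shows that $\psi^M \in D(\tau,P^*)$ and gives for $\psi^M$ a
lower bound for the posterior probability of CS. Let

(5.2)  $\mathcal{X} = \{\text{all possible observed values}\} = R^k$,

(5.3)  $\mathcal{X}_1 = \{x \in \mathcal{X} \mid x_{[k]} - d\sigma/\sqrt{n} \le x_{[1]}\}$,

(5.4)  $\mathcal{X}_i = \{x \in \mathcal{X} \mid x_{[i-1]} < x_{[k]} - d\sigma/\sqrt{n} \le x_{[i]}\}$, $2 \le i \le k$,

(5.5)  $\mathcal{X}_i^{(1)} = \{x \in \mathcal{X} \mid x_{[1]} = \cdots = x_{[i-1]} < x_{[k]} - d\sigma/\sqrt{n} \le x_{[i]}\} \subset \mathcal{X}_i$,

(5.6)  $\mathcal{X}_i^{(2)} = \{x \in \mathcal{X} \mid x_{[1]} = \cdots = x_{[i-1]} < x_{[k]} - d\sigma/\sqrt{n} = x_{[i]} = \cdots = x_{[k-1]}\} \subset \mathcal{X}_i^{(1)}$.

It is clear that $\{\mathcal{X}_i\}$ is a partition of the sample space
$\mathcal{X}$.

Theorem 5.1. Given P* (1/k < P* < 1) and the non-informative prior
$\tau$, if the observation $X = x \in \mathcal{X}_i$, then

(5.7)  $P(\mathrm{CS} \mid \psi^M, X = x) \ge Q^*(i)$

where

(5.8)  $Q^*(i) = P^* + (1-P^*)((k-i)/(k-1))$.

Therefore, $\psi^M \in D(\tau,P^*)$.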
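To make the constants concrete, the following sketch solves (5.1) for d by bisection and evaluates the bound (5.8). The function names, the trapezoidal quadrature and the bisection bracket [0, 10] are assumptions made for illustration; tabulated values of d from the literature would normally be used instead.

```python
import math


def norm_cdf(z):
    """Standard normal cdf Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def lhs_5_1(d, k, grid=2001, lim=8.0):
    """Trapezoidal approximation of the integral on the left of (5.1)."""
    h = 2.0 * lim / (grid - 1)
    acc = 0.0
    for g in range(grid):
        t = -lim + g * h
        weight = 0.5 if g in (0, grid - 1) else 1.0
        acc += weight * norm_cdf(t + d) ** (k - 1) * norm_pdf(t)
    return acc * h


def solve_d(k, p_star, lo=0.0, hi=10.0, iters=50):
    """Bisection for d(k, P*); the left side of (5.1) increases in d."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if lhs_5_1(mid, k) < p_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def q_star(i, k, p_star):
    """Lower bound (5.8) when psi^M selects k - i + 1 populations."""
    return p_star + (1.0 - p_star) * (k - i) / (k - 1)


d = solve_d(3, 0.90)
print(d, [q_star(i, 3, 0.90) for i in (1, 2, 3)])
```

For k = 2 the left side of (5.1) reduces to $\Phi(d/\sqrt{2})$, so the solver can be checked against $d = \sqrt{2}\,\Phi^{-1}(P^*)$. The bound equals 1 when all k populations are selected (i = 1) and falls to P* when only one is selected (i = k).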


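Proof: Without loss of generality we can assume $\sigma/\sqrt{n} = 1$.
Since $p_{[j]}(x)$ is nonincreasing in $x_{[m]}$ for all $m \ne j$,
given $x \in \mathcal{X}_i$, we have

$$\begin{aligned}
P(\mathrm{CS} \mid \psi^M, x)
&\ge \inf_{x \in \mathcal{X}_i} P(\mathrm{CS} \mid \psi^M, x) \\
&= \inf_{x \in \mathcal{X}_i} \sum_{j=i}^{k} p_{[j]}(x) \\
&= \inf_{x \in \mathcal{X}_i^{(1)}} \sum_{j=i}^{k} p_{[j]}(x) \\
&= 1 - \sup_{x \in \mathcal{X}_i^{(1)}} \sum_{m=1}^{i-1} p_{[m]}(x) \\
&= 1 - \sup_{x \in \mathcal{X}_i^{(1)}} \sum_{m=1}^{i-1} \int_{-\infty}^{\infty} \prod_{j \ne m} \Phi(t + x_{[m]} - x_{[j]})\, d\Phi(t) \\
&= 1 - \sup_{x \in \mathcal{X}_i^{(1)}} \sum_{m=1}^{i-1} \int_{-\infty}^{\infty} \Big\{\prod_{j \ge i} \Phi(t + x_{[m]} - x_{[j]})\Big\} \Phi^{i-2}(t)\, d\Phi(t) \\
&= 1 - \sum_{m=1}^{i-1} \int_{-\infty}^{\infty} \Phi(t-d)\Phi^{k-2}(t)\, d\Phi(t) \\
&= 1 - (i-1) \int_{-\infty}^{\infty} \Phi(t-d)\Phi^{k-2}(t)\, d\Phi(t) \\
&= Q^*(i).
\end{aligned}$$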

The supremum occurs when $x \in \mathcal{X}_i^{(2)}$. The last equality
follows from the identity

(5.9)  $(k-1)\int_{-\infty}^{\infty} \Phi^{k-2}(t)\Phi(t-d)\, d\Phi(t) = 1 - \int_{-\infty}^{\infty} \Phi^{k-1}(t+d)\, d\Phi(t)$

which can be proved by integration by parts.

Remark 5.1. If the procedure $\psi^M$ selects one population only, i.e.
$X = x \in \mathcal{X}_k$, then by Theorem 5.1 we have
$p_{[k]}(x) \ge P^*$, so both procedures $\psi^B$ and $\psi^{B*}$ will
select the same population. In other words, when the maximum-type
selection procedure does an excellent job, so do the Bayes-P* selection
procedures. However, the converse is not true. In general, under the
PP*-condition, the subset selected by $\psi^B$ or $\psi^{B*}$ is never
larger than the one selected by $\psi^M$; see Roth (1978) for some
discussion.

Remark 5.2. For the case k = 2, $\psi^B = \psi^M$ a.e. for any given
X = x: if $x \in \mathcal{X}_2$ then $p_{[2]}(x) \ge P^*$, hence both
$\psi^M$ and $\psi^B$ select the population $\pi_{(2)}$ which is
associated with $x_{[2]}$. If $x \in \mathcal{X}_1$ and
$x_{[2]} - d\sigma/\sqrt{n} < x_{[1]}$, then $\psi^M$ and $\psi^B$
select both populations $\pi_1$ and $\pi_2$. Since
$P(X_{[2]} - d\sigma/\sqrt{n} = X_{[1]}) = 0$, we have
$\psi^B = \psi^M$ a.e.

Remark 5.3. Under the assumption of Theorem 5.1, and from its proof,
we have a lower bound on the value of
$p_{[i]}(x) + \cdots + p_{[k]}(x)$ for any observation
$x \in \mathcal{X}_i$.


6.  Application  to  Poisson  Distributions  and  Poisson  Processes 


6.1.  Poisson  Distributions  Case 


Suppose that $\pi_1,\ldots,\pi_k$ are k independent Poisson
populations, where the independent observations
$X_{i1},\ldots,X_{in_i}$ from $\pi_i$ have the Poisson distribution with
parameter $\lambda_i$, denoted by $P(\cdot \mid \lambda_i)$,
$i = 1,\ldots,k$. Under the non-informative prior
$\tau(\lambda) \propto \lambda^{-1/2}$ for each population, if the best
population is associated with the maximum parameter, we have

(6.1)  $p_i(x) = P(\lambda_i = \lambda_{[k]} \mid x) = \int_0^{\infty} \prod_{j \ne i} \chi^2_{m_j}(y n_j/n_i)\, d\chi^2_{m_i}(y)$

(6.2)  $\phantom{p_i(x)} = \int_0^{\infty} \prod_{j \ne i} \chi^2_{m_j}(y)\, d\chi^2_{m_i}(y)$, if $n_1 = \cdots = n_k$,

where

(6.3)  $m_i = 2 n_i \bar{x}_i + 1$, $\bar{x}_i = \sum_{j=1}^{n_i} x_{ij}/n_i$,

and $\chi^2_m$ is the cdf of the chi-squared distribution with m
degrees of freedom.

On the other hand, if we are interested in selecting the population
with the smallest parameter $\lambda$, then

(6.4)  $p_i(x) = \int_0^{\infty} \prod_{j \ne i} \big[1 - \chi^2_{m_j}(y n_j/n_i)\big]\, d\chi^2_{m_i}(y)$

(6.5)  $\phantom{p_i(x)} = \int_0^{\infty} \prod_{j \ne i} \big[1 - \chi^2_{m_j}(y)\big]\, d\chi^2_{m_i}(y)$, if $n_1 = \cdots = n_k$.

For this case, the simulation results for procedures $\psi^B$ and
$\psi^{B*}$ are tabulated in Table 5.
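The integral (6.1) for the largest-parameter case can be evaluated with the standard library alone. The sketch below is an illustration under stated assumptions: the chi-squared cdf is computed from the power series of the regularized incomplete gamma function, the substitution $y = z^2$ is used so that the integrand stays bounded near zero, and the function names, grid size and truncation point are hypothetical choices.

```python
import math


def chi2_cdf(y, m):
    """Chi-squared cdf with m degrees of freedom (incomplete-gamma series)."""
    if y <= 0.0:
        return 0.0
    a, x = 0.5 * m, 0.5 * y
    term = 1.0 / a
    total = term
    n = 0
    while term > 1e-17 * total and n < 100000:
        n += 1
        term *= x / (a + n)
        total += term
    return min(1.0, total * math.exp(a * math.log(x) - x - math.lgamma(a)))


def chi_pdf(z, m):
    """Density of sqrt(Y) when Y is chi-squared with m degrees of freedom."""
    if z <= 0.0:
        return math.sqrt(2.0 / math.pi) if m == 1 else 0.0
    log_pdf = ((m - 1) * math.log(z) - 0.5 * z * z
               - (0.5 * m - 1.0) * math.log(2.0) - math.lgamma(0.5 * m))
    return math.exp(log_pdf)


def poisson_best_probs(totals, n, grid=1201):
    """p_i(x) of (6.1); totals[i] = n_i * xbar_i, n[i] = sample size n_i."""
    k = len(totals)
    m = [2 * s + 1 for s in totals]          # degrees of freedom from (6.3)
    probs = []
    for i in range(k):
        upper = math.sqrt(m[i]) + 10.0       # truncation point for z = sqrt(y)
        h = upper / (grid - 1)
        acc = 0.0
        for g in range(grid):
            z = g * h
            term = chi_pdf(z, m[i])
            for j in range(k):
                if j != i:
                    term *= chi2_cdf(z * z * n[j] / n[i], m[j])
            acc += (0.5 if g in (0, grid - 1) else 1.0) * term
        probs.append(acc * h)
    return probs


p = poisson_best_probs([12, 20, 15], [5, 5, 5])
print(p, sum(p))
```

The smallest-parameter case (6.4) follows by replacing each cdf factor with one minus the cdf.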


6.2.  Poisson  Processes  Case 


Suppose we have k independent Poisson processes
$\{X^{(1)}(t)\},\ldots,\{X^{(k)}(t)\}$ with expected arrival times
equal to $1/\lambda_1,\ldots,1/\lambda_k$, respectively. Hence for the
process $\{X^{(i)}(t)\}$, the probability that there are $m_i$ arrivals
until time $t_i$ is

(6.6)  $P(X^{(i)}(t_i) = m_i \mid \lambda_i, t_i) = (t_i\lambda_i)^{m_i} \exp(-t_i\lambda_i)/m_i!$.

If there exists no prior information, then we use the non-informative
prior $\tau(\lambda_i) \propto \lambda_i^{-1/2}$ for all processes. Let
$m = (m_1,\ldots,m_k)$ and $t = (t_1,\ldots,t_k)$; it can be shown that
the posterior probability that the ith Poisson process has the maximum
parameter, i.e. the minimum expected waiting time, given (m,t) is

(6.7)  $p_i(m,t) = \int_0^{\infty} \prod_{j \ne i} \chi^2_{2m_j+1}(y t_j/t_i)\, d\chi^2_{2m_i+1}(y)$, $i = 1,\ldots,k$.

Here  we  list  two  special  cases  which  are  of  interest. 

(a) Observations m of all processes are obtained in time intervals
$[s_i, t_0 + s_i]$ of common length. Since a Poisson process is
stationary, we can assume that $s_i = 0$. In this case $p_i(m,t)$ is
independent of t.

(b) All $m_i$'s are equal to $m_0$, i.e. we fix $m_0$ first, then get
the observations, the waiting times, t. Let $t_i$ be the waiting time
of the $m_0$th arrival in the ith process; then $t_i$ has a gamma
distribution with pdf given by

(6.8)  $f(t_i) = \frac{\lambda_i}{\Gamma(m_0)} (\lambda_i t_i)^{m_0 - 1} e^{-\lambda_i t_i}$, $t_i \ge 0$.

By using the same non-informative prior for $\lambda$ as before, we
get the same formulas for $p_i(m,t)$. That is, under case (b) the
selection problem is identical with the selection problem on
populations with gamma or exponential distributions.


Remark 6.1. Under the non-informative prior, in comparing the subset
selection problem for k Poisson distributions with the problem for
k Poisson processes, it is easily seen that the Poisson distribution
model is a special case of the Poisson processes model, with
$t_i = n_i$ and $m_i = n_i\bar{x}_i$.

7.  COMPARISON  OF  THE  PERFORMANCE  OF  *B\  *B,  AND  <j,MED 

Let $\pi_i$, $i = 1,\ldots,k$, be k independent populations, where $\pi_i$ has
the associated cdf $F_i(x,\theta_i) = F(x-\theta_i)$ with unknown location
parameter $\theta_i$. Let $f_i(x,\theta_i) = f(x-\theta_i)$ be the pdf. Suppose
the goal is to find a small (nontrivial) subset which contains the best.


The following subset selection procedure $\psi^{MED}$ based on sample
medians is due to Gupta and Singh (1980).

$\psi^{MED}$: Select $\pi_i$ if and only if $Y_i \ge Y_{[k]} - d'$

where $Y_i$ is the median of the 2m+1 random observations from population
$\pi_i$ and $Y_{[k]} = \max_j Y_j$. The value of $d'$ is determined by the
following equation so that the P*-condition is met.

(7.1)  $\int_{-\infty}^{\infty} W^{k-1}(u+d') A(u) \, du = P^*$

where

(7.2)  $A(u) = ((2m+1)!/(m!)^2) [F(u)]^m [1-F(u)]^m f(u)$

(7.3)  $W(u) = I_{F(u)}(m+1, m+1)$

where $I_y(a,b)$ is the incomplete beta function.
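As an illustration of how (7.1)-(7.3) determine $d'$, the following sketch solves (7.1) numerically. It is not taken from the original report and rests on stated assumptions: F is taken to be the standard normal cdf, the integral is approximated by the trapezoidal rule on [-8, 8], and $d'$ is found by bisection. The incomplete beta function with equal integer arguments is evaluated through the binomial identity $I_p(m+1, m+1) = P(\mathrm{Bin}(2m+1, p) \ge m+1)$. All function names are hypothetical.

```python
import math

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def median_cdf(u, m):
    """W(u) of (7.3): cdf of the median of 2m+1 standard normal draws,
    computed as a binomial tail probability."""
    p = normal_cdf(u)
    n = 2 * m + 1
    return sum(math.comb(n, j) * p**j * (1.0 - p)**(n - j)
               for j in range(m + 1, n + 1))

def median_pdf(u, m):
    """A(u) of (7.2): pdf of the same median."""
    p = normal_cdf(u)
    c = math.factorial(2 * m + 1) / math.factorial(m) ** 2
    return c * p**m * (1.0 - p)**m * normal_pdf(u)

def pcs_integral(d, k, m, lo=-8.0, hi=8.0, steps=4000):
    """Left-hand side of (7.1), by the trapezoidal rule."""
    h = (hi - lo) / steps
    total = 0.0
    for s in range(steps + 1):
        u = lo + s * h
        weight = 0.5 if s in (0, steps) else 1.0
        total += weight * median_cdf(u + d, m) ** (k - 1) * median_pdf(u, m)
    return total * h

def solve_d(p_star, k, m, tol=1e-6):
    """Bisection for d' in (7.1); the integral is increasing in d'."""
    lo, hi = 0.0, 10.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pcs_integral(mid, k, m) < p_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Example: k = 5 populations, 2m + 1 = 5 observations each, P* = .90.
d_prime = solve_d(0.90, k=5, m=2)
```

A convenient sanity check is that `pcs_integral(0.0, k, m)` is close to 1/k, because A is the derivative of W.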

In this section we use Monte Carlo simulation techniques to
compare the performance of selection procedures $\psi^{B}$, $\psi^{B*}$,
$\psi^{M}$ and $\psi^{MED}$ in the normal means problem. Because selection
procedures $\psi^{M}$ and $\psi^{MED}$ are not based on any prior information
about the unknown parameters, we assume that the prior distribution
$\tau$ for both procedures $\psi^{B}$ and $\psi^{B*}$ is non-informative.
Since procedure $\psi^{M}$ satisfies both the P*-condition and the
PP*-condition with respect to the non-informative prior, it makes sense
to compare the Bayes-P* procedures $\psi^{B}$ and $\psi^{B*}$ with $\psi^{M}$,
and to compare $\psi^{M}$ with $\psi^{MED}$, in terms of efficiency.
Furthermore, we assume the true distributions to be non-normal
distributions, namely, the logistic, Laplace (the double exponential)
and the gross error model (the contaminated) distribution, but keep
the selection procedure unchanged (i.e. still based on the normal
assumption) and compute the efficiency. The Monte Carlo simulation
results for both equal distances of the parameters and slippage cases
are tabulated. In the simulation study all generated random variables
are adjusted to have variance one. Each time five random numbers with
the indicated distribution were generated for each population. All four
procedures are applied to the same data. The simulation process was
repeated one hundred times. The relative frequency of selecting
population $\pi_i$ is used as an approximation to the probability of
selecting population $\pi_i$. The sum of the relative frequencies of
selecting each population is treated as an approximation of the
expected selected size. The efficiency (EFF) of each selection
procedure is approximated by the ratio of the relative frequency of
selecting the best one to the expected selected size.
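The bookkeeping described above (relative frequencies, expected selected size, EFF) can be sketched as follows. This is a minimal illustration, not the original simulation program. It uses normal data with variance one and applies only the median rule $\psi^{MED}$ as stated in this section. The value d = 1.0 is a hypothetical placeholder that has not been calibrated through (7.1), and the function names, seed and mean configuration are assumptions.

```python
import random
import statistics

def select_by_median(samples, d):
    """psi^MED: keep population i iff its sample median is within d
    of the largest sample median."""
    medians = [statistics.median(s) for s in samples]
    cutoff = max(medians) - d
    return [i for i, y in enumerate(medians) if y >= cutoff]

def simulate(rule, means, n=5, reps=100, seed=0):
    """Return (relative selection frequencies, estimated ES, estimated EFF)."""
    rng = random.Random(seed)
    k = len(means)
    best = max(range(k), key=lambda i: means[i])
    counts = [0] * k
    for _ in range(reps):
        # n unit-variance normal observations per population
        samples = [[rng.gauss(mu, 1.0) for _ in range(n)] for mu in means]
        for i in rule(samples):
            counts[i] += 1
    freq = [c / reps for c in counts]
    es = sum(freq)            # approximates the expected selected size
    eff = freq[best] / es     # approximates P(CS) / E(S)
    return freq, es, eff

# Equal-distance configuration with k = 5; d = 1.0 is illustrative only.
means = [0.0, 0.5, 1.0, 1.5, 2.0]
freq, es, eff = simulate(lambda s: select_by_median(s, d=1.0), means)
```

Any other selection rule can be passed in place of the lambda, provided it maps the list of samples to the indices of the selected populations.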

The simulation results indicate that in all cases we have the
performance

(7.4)  $\psi^{B*} > \psi^{B} > \psi^{M}$,

where the symbol '>' stands for "better than".

For small sample size, the efficiency of rule $\psi^{M}$ tends to be
larger than that of $\psi^{MED}$, under the P*-condition.

Remark 7.1. The gross error model we used has the density function

(7.5)  $f(x-\theta) = (1-c)\,\varphi(x-\theta) + \frac{c}{4}\,\varphi\!\left(\frac{x-\theta}{4}\right), \quad c = .15$

for which $\varphi$ is the pdf of N(0,1), and the variance for this
distribution is $(1-c) + 16c = 3.25$.
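The following sketch shows one way the non-normal observations could be generated and rescaled to variance one. It is an illustration under stated assumptions rather than the generator used in the study. The contaminating component of the gross error model is taken to be N(0, 16), as implied by the variance $(1-c) + 16c$. The logistic and Laplace scales are set from the standard variance formulas $\pi^2 s^2/3$ and $2b^2$. Function names are hypothetical.

```python
import math
import random

def gross_error_unit_variance(rng, theta=0.0, c=0.15, scale=4.0):
    """Draw from (1-c) N(0,1) + c N(0, scale^2), rescaled to variance one."""
    sd = scale if rng.random() < c else 1.0
    variance = (1.0 - c) + c * scale ** 2   # 3.25 for c = .15, scale = 4
    return theta + rng.gauss(0.0, sd) / math.sqrt(variance)

def logistic_unit_variance(rng, theta=0.0):
    """Logistic draw by inversion; scale s chosen so pi^2 s^2 / 3 = 1."""
    s = math.sqrt(3.0) / math.pi
    u = rng.random()
    while u == 0.0:                         # avoid log(0)
        u = rng.random()
    return theta + s * (math.log(u) - math.log(1.0 - u))

def laplace_unit_variance(rng, theta=0.0):
    """Laplace draw as a difference of two exponentials; 2 b^2 = 1."""
    b = 1.0 / math.sqrt(2.0)
    return theta + rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)

# Five observations from one population centred at theta = 0.5.
rng = random.Random(0)
sample = [gross_error_unit_variance(rng, theta=0.5) for _ in range(5)]
```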

In the tables, the efficiency (EFF) of a procedure $\psi$, given
parameter $\theta$, is defined by

(7.6)  $\mathrm{EFF}(\psi) = P_\theta(CS \mid \psi)/E_\theta(S \mid \psi)$

where $E_\theta(S \mid \psi)$ is the expected selected size.


Discussion of the Tables:

For Table 1 and Table 2 (equal distances case) the value of P* is
.99 and .90 respectively, the common sample size n = 5, and k = 5. If
the k populations have normal distributions with the unknown parameter
configuration $(\theta, \theta+\Delta, \ldots, \theta+(k-1)\Delta)$ and common
variance one, then from both tables the performance based on either the
efficiency or the expected selected size is

(7.7)  $\psi^{B*} > \psi^{B} > \psi^{M}$

if the PP*-condition is considered; and

(7.8)  $\psi^{M} > \psi^{MED}$

under the P*-condition.

When the true distributions are not normal, but the logistic,
Laplace or the gross error model, the simulation results are very
close to the normal case. This suggests the four procedures are
reasonably robust. In Table 2 all efficiencies are larger than the
corresponding ones in Table 1. This is to be expected because the
value of P* is smaller in the second table.

For Table 3 and Table 4 (slippage case) the value of P* is
.99 and .90 respectively, the common sample size n = 5, and k = 5.
If the k populations have normal distributions with unknown parameter
configuration $(\theta, \ldots, \theta, \theta+\Delta)$ and common variance
one, then from both tables the performance is the same as in the equal
distances case.

Note that in both the equal distances and slippage cases, when
$\Delta\sqrt{n} > 1$, which means that the largest population mean and the
second largest population mean are not very close, the Bayes-P*
selection procedures $\psi^{B}$ and $\psi^{B*}$, with respect to the locally
uniform priors, always satisfy not only the PP*-condition but also the
estimated $P(CS) \ge P^*$, and the expected selected size of the Bayes-P*
procedures is much less than that of the selection procedures $\psi^{M}$
and $\psi^{MED}$. For example, in the normal equal distance case, P* = .99,
k = 5 and $\Delta\sqrt{n} = 4$, we have
$E(S \mid \psi^{MED}) - E(S \mid \psi^{B}) = .370$;
in the normal slippage case, P* = .99, k = 5 and $\Delta\sqrt{n} = 4$, we have
$E(S \mid \psi^{MED}) - E(S \mid \psi^{B}) = 1.560$.

8.  DISCUSSION: 

The Bayes-P* selection procedures $\psi^{B}$ and $\psi^{B*}$ are highly
efficient and have the following advantages.

a. These procedures can apply to any family of distributions, even
their mixtures, and do not need equal sample size for each
population.

b. Good prior information will not be ignored. Even in the
non-informative situation, they still perform well.

c. They are robust in terms of the loss function. We do not even
need to specify or to know the exact form of the loss function
before we make a decision. There will automatically be a Bayes
decision procedure under the control condition and the assumptions
given by Theorem 3.3 and Theorem 3.4.

d. The weight or contribution of each population in the selected
subset is known.

e. Compared with the classical maximum-type or average-type selection
procedures, the Bayes-P* selection procedure is less sensitive to
the total number of populations. For example, in the normal case,
if there are some newly added populations with very small sample
means, from Section 4 we can see that the selected subset for the
Bayes-P* selection procedure is nearly unchanged; however, the
selected subset for the classical procedures may increase rapidly.

f. Based on the simulation results of Section 7, the Bayes-P* selection
procedure is robust if the true family of distributions for each
population is symmetric.

The only disadvantage in using the proposed selection procedures
is that the computation of the posterior probabilities requires more
work than the classical selection procedures, which can use
precalculated tables; however, this disadvantage can be offset by the
use of computers. In fact, we need not evaluate all $p_i$'s, but only a
few of the large ones.
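The remark that only the largest posterior probabilities matter can be illustrated by the following sketch. It assumes, as a reading of the PP*-condition and not as a restatement of the formal definition given earlier in the report, that the nonrandomized Bayes-P* rule includes populations in decreasing order of posterior probability and stops as soon as the accumulated posterior probability reaches P*. The function name and the numerical inputs are hypothetical.

```python
def bayes_p_star_subset(posterior_probs, p_star):
    """Greedy subset: add populations by decreasing posterior probability
    until the accumulated probability is at least p_star.

    Assumed form of the nonrandomized rule; see the lead-in above.
    Returns (sorted indices of the selected populations, accumulated mass).
    """
    order = sorted(range(len(posterior_probs)),
                   key=lambda i: posterior_probs[i], reverse=True)
    selected, total = [], 0.0
    for i in order:
        selected.append(i)
        total += posterior_probs[i]
        if total >= p_star:
            break   # the remaining, smaller probabilities are never needed
    return sorted(selected), total

# Hypothetical posterior probabilities for k = 4 populations.
subset, mass = bayes_p_star_subset([0.02, 0.60, 0.35, 0.03], p_star=0.90)
```

Under this assumed rule, adding further populations with negligible posterior probability changes neither the subset nor the accumulated mass, which is consistent with advantage (e) above.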


TABLE 1

Efficiency (EFF) and expected selected size (ES) (based on
simulation) of $\psi^{B*}$, $\psi^{B}$, $\psi^{M}$ and $\psi^{MED}$, under normal
assumption, when the unknown means of the k populations are
$\theta, \theta+\Delta, \ldots, \theta+(k-1)\Delta$; the common variance = 1,
common sample size n = 5 and the prior for $\psi^{B}$ and $\psi^{B*}$ is the
non-informative prior.

Column headings:  normal (EFF, ES);  logistic (EFF, ES);
Laplace (EFF, ES);  gross error (EFF, ES)

Surviving entries, in source order (row and column positions are not
recoverable; one entry is illegible):

.254  3.809  4.110  .250  3  .238  [illegible]  4
.208  4.810  .207  4  .203  4.810  .202  4
.541  .430  .417
1.884  .559  1.778  2.010  .510  1.360
2.410  .437  2.290  2.760  .363  2.720
TABLE 2

Efficiency (EFF) and expected selected size (ES) (based on
simulation) of $\psi^{B*}$, $\psi^{B}$, $\psi^{M}$ and $\psi^{MED}$, under normal
assumption, when the unknown means of the k populations are
$\theta, \theta+\Delta, \ldots, \theta+(k-1)\Delta$; the common variance = 1,
common sample size n = 5 and the prior for $\psi^{B*}$ and $\psi^{B}$ is the
non-informative prior.

k = 5,  P* = .90

Column headings:  normal (EFF, ES);  logistic (EFF, ES);
Laplace (EFF, ES);  gross error (EFF, ES)

Surviving entries, in source order (row labels and the remaining
columns are not recoverable):

.334  2.449  .353
.303  2.870  .313
.239  4.020  .252
.226  4.200  .231

.493  1.807  .523
.444  2.160  .454
.327  3.030  .331
.297  3.270  .287

.773  1.275  .770
.671  1.490  .678
.535  1.870  .541
.490  2.040  .490

2.359  2.780  3.930  4.240

gross error (EFF, ES):
.333  2.611
.310  3.030
.231  4.280
.215  4.650

TABLE 3

Efficiency (EFF) and expected selected size (ES) (based on
simulation) of $\psi^{B*}$, $\psi^{B}$, $\psi^{M}$ and $\psi^{MED}$, under normal
assumption, when the unknown means of the k populations are
$\theta, \ldots, \theta, \theta+\Delta$; the common variance = 1, common sample
size n = 5 and the prior for $\psi^{B}$ and $\psi^{B*}$ is the
non-informative prior.

Surviving column headings:  logistic;  gross error

Surviving entries, in source order (row and column positions are not
recoverable; some entries are truncated):

4.192  .218  4
.204  4.520  .212  4
.201  4.970  .202  4
.203  4.930  .202  4
.228  4.101  .240  4
.220  4.400  .224  4
.202  4.940  .203  4
.204  4.890  .201  4
.253  .580  4.950  4.940  4.900  4.960
3.598  .27  3.970  .25
.214  4.680  .211  4.750  .214
.208  4.800  .207  4.840  .203
3.750
1.893  2.110  2.810  3.750
TABLE 4

Efficiency (EFF) and expected selected size (ES) (based on
simulation) of $\psi^{B*}$, $\psi^{B}$, $\psi^{M}$ and $\psi^{MED}$, under normal
assumption, when the unknown means of the k populations are
$\theta, \ldots, \theta, \theta+\Delta$; the common variance = 1, common sample
size n = 5 and the prior for $\psi^{B}$ and $\psi^{B*}$ is the
non-informative prior.

k = 5,  P* = .90

[Table body missing in the source.]


TABLE 5

For procedures $\psi^{B*}$ and $\psi^{B}$ and the parameter configurations
(.5, 1, ..., .5k) of k Poisson populations, this table gives the
values (based on simulation) of the probability of selecting the
population with parameter .5i, i = 1,...,k and the expected selected
size ES. The prior distribution for each population is
$\tau(\lambda) \propto \lambda^{-1/2}$.

n = 10

The table has four column groups, each giving $\psi^{B*}$ and then
$\psi^{B}$; the group headings and the row labels are not recoverable.
Within each block the rows give the selection probabilities of the k
populations, in source order, followed by ES. The block labels k are
inferred from the number of rows.

        B*      B   |   B*      B   |   B*      B   |   B*      B

k = 2
       .933  1.000  |  .966   .990  |  .950   .990  |  .946   .990
       .670   .820  |  .450   .670  |  .324   .490  |  .110   .190
ES    1.663  1.820  | 1.415  1.660  | 1.274  1.480  | 1.005  1.180

k = 3
       .998  1.000  |  .990  1.000  |  .967  1.000  |  .929  1.000
       .741   .850  |  .338   .630  |  .321   .510  |  .171   .280
       .205   .330  |  .120   .130  |  .065   .110  |  .021   .050
ES    1.944  2.180  | 1.498  1.760  | 1.354  1.620  | 1.121  1.330

k = 4
       .988  1.000  |  .981  1.000  |  .966   .990  |  .911   .950
       .754   .820  |  .560   .730  |  .250   .400  |  .175   .290
       .369   .490  |  .070   .140  |  .067   .100  |  .025   .050
       .075   .110  |  .041   .060  |  .024   .040  |  0      0
ES    2.187  2.420  | 1.653  1.930  | 1.316  1.530  | 1.112  1.290

k = 5
       .996  1.000  |  .981   .990  |  .985  1.000  |  .950   .980
       .737   .850  |  .503   .650  |  .367   .550  |  .128   .260
       .355   .470  |  .102   .160  |  .133   .210  |  .013   .030
       .067   .090  |  .015   .030  |  .013   .020  |  .006   .010
       0      0     |  .006   .010  |  0      0     |  0      0
ES    2.154  2.410  | 1.607  1.840  | 1.498  1.780  | 1.097  1.280